    Advances on Time Series Analysis using Elastic Measures of Similarity

    A sequence is a collection of data instances arranged in a structured manner. When this arrangement is held in the time domain, sequences are instead referred to as time series. As such, each observation in a time series corresponds to a value drawn from an underlying process at a specific time instant. However, other types of data indexing structures, such as space- or threshold-based arrangements, are possible. The data points that compose a time series are often correlated with each other. To account for this correlation in data mining tasks, time series are usually studied as whole data objects rather than as collections of independent observations. In this context, techniques for time series analysis apply approaches specifically developed to leverage the intrinsic properties of time series in a wide range of problems, such as classification, clustering and other related tasks. The development of monitoring and storage devices has made time series analysis proliferate in numerous application fields, including medicine, economics, manufacturing and telecommunications, among others. Over the years, the community has devoted considerable effort to the development of new data-based techniques for time series analysis suited to the problems and needs of such application fields. In the related literature, these techniques can be divided into three main groups: feature-, model- and distance-based methods. The first group (feature-based) transforms time series into collections of features, which are then used by conventional learning algorithms to solve the task under consideration. In contrast, methods belonging to the second group (model-based) assume that each time series is drawn from a generative model, which is then harnessed to elicit knowledge from the data. Finally, distance-based techniques operate directly on raw time series. To this end, these methods resort to specially defined measures of distance or similarity for comparing time series, without requiring any further processing. Among them, elastic similarity measures (e.g., dynamic time warping and edit distance) compute the closeness between two sequences by finding the best alignment between them, disregarding differences in time and thus focusing exclusively on differences in shape. This Thesis presents several contributions to the field of distance-based techniques for time series analysis, namely: i) a novel multi-dimensional elastic similarity learning method for time series classification; ii) an adaptation of elastic measures to streaming time series scenarios; and iii) the use of distance-based time series analysis to make machine learning methods for image classification robust against adversarial attacks. Throughout the Thesis, each contribution is framed within its related state of the art, explained in detail and empirically evaluated. The obtained results lead to new insights on the application of distance-based time series methods in the considered scenarios, and motivate research directions that highlight the vibrant momentum of this research area.
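    To make the notion of an elastic measure concrete, the sketch below implements classic dynamic time warping for two one-dimensional sequences in Python; the function name and the toy data are illustrative and are not taken from the Thesis.

        import numpy as np

        def dtw_distance(x, y):
            """Classic DTW distance between two 1-D sequences x and y."""
            n, m = len(x), len(y)
            # cost[i, j] = minimum cumulative cost of aligning x[:i] with y[:j]
            cost = np.full((n + 1, m + 1), np.inf)
            cost[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    d = abs(x[i - 1] - y[j - 1])              # local distance
                    cost[i, j] = d + min(cost[i - 1, j],      # repeat y[j-1]
                                         cost[i, j - 1],      # repeat x[i-1]
                                         cost[i - 1, j - 1])  # match both
            return cost[n, m]

        # Two sinusoids shifted in time keep a small DTW distance despite the lag.
        a = np.sin(np.linspace(0, 2 * np.pi, 50))
        b = np.sin(np.linspace(0, 2 * np.pi, 50) + 0.5)
        print(dtw_distance(a, b))

    The quadratic cost matrix built here is precisely what the streaming adaptations described in the following entries aim to avoid recomputing from scratch.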

    On-Line Dynamic Time Warping for Streaming Time Series

    Dynamic Time Warping is a well-known measure of dissimilarity between time series. Due to its flexibility in dealing with non-linear distortions along the time axis, this measure has been widely used in machine learning models for this kind of data. Nowadays, the proliferation of streaming data sources has drawn the interest and attention of the scientific community to on-line learning models. In this work, we naturally adapt Dynamic Time Warping to the on-line learning setting. Specifically, we propose a novel on-line measure of dissimilarity for streaming time series which combines a warp constraint and a weighted memory mechanism to simplify the time series alignment and to adapt to non-stationary intervals over time. Computer simulations are analyzed and discussed to shed light on the performance and complexity of the proposed measure.
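    As a rough illustration of how such an on-line adaptation can work, the Python sketch below (with hypothetical names) extends a DTW cost column by one sample per arriving observation, so each update costs O(n) in the length of the reference series; it deliberately omits the warp constraint and weighted memory mechanism of the proposed measure.

        import numpy as np

        class IncrementalDTW:
            """Toy incremental DTW against a fixed reference series."""

            def __init__(self, reference):
                self.reference = np.asarray(reference, dtype=float)
                n = len(self.reference)
                # Cumulative-cost column before any stream sample arrives:
                # only the virtual origin (index 0) is reachable.
                self.prev_col = np.full(n + 1, np.inf)
                self.prev_col[0] = 0.0

            def update(self, sample):
                """Consume one streaming sample and return the current alignment cost."""
                n = len(self.reference)
                # Row 0 stays infinite once the stream has started (standard boundary).
                col = np.full(n + 1, np.inf)
                for i in range(1, n + 1):
                    d = abs(self.reference[i - 1] - sample)
                    col[i] = d + min(self.prev_col[i],       # repeat the stream sample
                                     col[i - 1],             # repeat the reference sample
                                     self.prev_col[i - 1])   # diagonal step
                self.prev_col = col
                return col[n]

        stream_dtw = IncrementalDTW(reference=[0.0, 1.0, 0.0, -1.0])
        for value in (0.1, 0.9, 0.2, -0.8):
            print(stream_dtw.update(value))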

    On-line Elastic Similarity Measures for time series

    The way similarity is measured among time series is of paramount importance in many data mining and machine learning tasks. For instance, Elastic Similarity Measures are widely used to determine whether two time series are similar to each other. Indeed, in off-line time series mining, these measures have been shown to be very effective due to their ability to handle time distortions and mitigate their effect on the resulting distance. In the on-line setting, where available data increase continuously over time and not necessarily in a stationary manner, stream mining approaches are required to be fast, to have limited memory consumption and to be capable of adapting to different stationary intervals. In this sense, the computational complexity of Elastic Similarity Measures and their lack of flexibility to accommodate different stationary intervals make these similarity measures incompatible with such requirements. To overcome these issues, this paper adapts the family of Elastic Similarity Measures – which includes Dynamic Time Warping, Edit Distance, Edit Distance for Real Sequences and Edit Distance with Real Penalty – to the on-line setting. The proposed adaptation is based on two main ideas: a forgetting mechanism and incremental computation. The former makes the similarity consistent with the characteristics of streaming time series by giving more importance to recent observations, whereas the latter reduces the computational complexity by avoiding unnecessary computations. In order to assess the behavior of the proposed similarity measures in on-line settings, two different experiments have been carried out. The first aims at showing the efficiency of the proposed adaptation; to do so, we calculate and compare the computation times of the elastic measures and their on-line adaptations. By analyzing the results drawn from a distance-based streaming machine learning model, the second experiment shows the effect of the forgetting mechanism on the resulting similarity value. The experiments show, for the aforementioned Elastic Similarity Measures, that the proposed adaptation meets the memory, computational complexity and flexibility constraints imposed by streaming data.
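    The forgetting idea can be illustrated with a simple decay factor applied to the accumulated cost before each new local distance is added, so that contributions from older observations fade over time; the Python sketch below is only an illustration of this idea under that simplifying assumption, not the exact formulation proposed in the paper.

        import numpy as np

        def forgetful_update(prev_col, reference, sample, rho=0.95):
            """Extend a cumulative-cost column by one streaming sample.

            rho in (0, 1] decays the cost carried over from past samples;
            rho = 1 recovers the plain (unweighted) incremental recurrence.
            """
            n = len(reference)
            col = np.full(n + 1, np.inf)
            for i in range(1, n + 1):
                d = abs(reference[i - 1] - sample)
                carried = min(prev_col[i], col[i - 1], prev_col[i - 1])
                col[i] = d + rho * carried   # older costs weigh progressively less
            return col

        # Start from the usual DTW boundary column and feed samples one by one.
        reference = np.array([0.0, 1.0, 0.0])
        col = np.full(len(reference) + 1, np.inf)
        col[0] = 0.0
        for sample in (0.2, 0.8, 0.1):
            col = forgetful_update(col, reference, sample)
        print(col[-1])   # current decayed alignment cost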

    Detection of non-technical losses in smart meter data based on load curve profiling and time series analysis

    The advent and progressive deployment of the so-called Smart Grid has unleashed a profitable portfolio of new possibilities for the efficient management of the low-voltage distribution network, supported by the introduction of information and communication technologies that exploit its digitalization. Among all these possibilities, this work focuses on the detection of anomalous energy consumption traces: whether they are due to malfunctioning metering equipment or to fraud, utilities invest considerable effort in detecting such outlying events and addressing them in order to optimize power distribution and avoid significant income losses. In this context, this manuscript introduces a novel algorithmic approach for the identification of consumption outliers in Smart Grids that relies on concepts from probabilistic data mining and time series analysis. A key ingredient of the proposed technique is its ability to accommodate time irregularities – shifts and warps – in the consumption habits of the user by concentrating on the shape of the consumption rather than on its temporal properties. Simulation results over real data from a Spanish utility are presented and discussed, from which it is concluded that the proposed approach excels at detecting different outlier cases emulated on the aforementioned consumption traces. Funding: Ministerio de Energía y Competitividad under the RETOS program (OSIRIS project, grant ref. RTC-2014-1556-3).
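    As a purely hypothetical illustration of shape-based outlier flagging (not the algorithm of the manuscript), daily load curves could be compared against a user's median profile with an elastic distance such as the dtw_distance sketch shown earlier, flagging the days whose distance exceeds a robust threshold; all names and the threshold rule below are assumptions made for the example.

        import numpy as np

        def flag_outlier_days(daily_curves, distance, k=3.0):
            """daily_curves: array of shape (n_days, samples_per_day);
            distance: a callable, e.g. the dtw_distance sketch above."""
            profile = np.median(daily_curves, axis=0)        # typical daily shape
            dists = np.array([distance(day, profile) for day in daily_curves])
            med = np.median(dists)
            mad = np.median(np.abs(dists - med)) + 1e-12     # robust spread
            return dists > med + k * mad                     # boolean mask of outlying days

    A shape-based distance lets a dinner-time peak that arrives an hour late still resemble the usual evening profile, whereas a point-wise comparison would penalize the shift itself.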

    On-Farm Information: A Valuable Tool for the Sustainable Management of Mountain Pastures in Protected Natural Areas

    Mountain pastures have traditionally been maintained by livestock. The analysis of data on the characteristics, productive-reproductive management and land use of commercial farms can constitute a practical approach to studying these systems and the changes that are occurring. This information is necessary to develop new utilisation guidelines that make livestock production compatible with the conservation of natural resources. This paper describes a methodological framework to study the issues described above through some examples taken from a wider research project (Mandaluniz et al., 2003).

    First measurement of neutrino oscillation parameters using neutrinos and antineutrinos by NOvA

    The NOvA experiment has seen a 4.4σ signal of ν̄e appearance in a 2 GeV ν̄μ beam at a distance of 810 km. Using 12.33 × 10²⁰ protons on target delivered to the Fermilab NuMI neutrino beamline, the experiment recorded 27 ν̄μ → ν̄e candidates with a background of 10.3, and 102 ν̄μ → ν̄μ candidates. These new antineutrino data are combined with neutrino data to measure the parameters |Δm²₃₂| = 2.48 (+0.11/−0.06) × 10⁻³ eV²/c⁴ and sin²θ₂₃ in the ranges (0.53–0.60) and (0.45–0.48), in the normal neutrino mass hierarchy. The data exclude most values near δ_CP = π/2 for the inverted mass hierarchy by more than 3σ, and favor the normal neutrino mass hierarchy by 1.9σ and θ₂₃ values in the upper octant by 1.6σ.

    Supernova neutrino detection in NOvA

    The NOvA long-baseline neutrino experiment uses a pair of large, segmented, liquid-scintillator calorimeters to study neutrino oscillations with GeV-scale neutrinos from the Fermilab NuMI beam. These detectors are also sensitive to the flux of neutrinos emitted during a core-collapse supernova through inverse beta decay interactions on carbon at energies of O(10 MeV). This signature provides a means to study the dominant mode of energy release for a core-collapse supernova occurring in our galaxy. We describe the data-driven software trigger system developed and employed by the NOvA experiment to identify and record neutrino data from nearby galactic supernovae. This technique has been used by NOvA to self-trigger on potential core-collapse supernovae in our galaxy, with an estimated sensitivity reaching out to a distance of 10 kpc and a detection efficiency of 23% to 49% for supernovae from progenitor stars with masses of 9.6 M⊙ and 27 M⊙, respectively.

    Search for multimessenger signals in NOvA coincident with LIGO/Virgo detections

    Using the NOvA neutrino detectors, a broad search has been performed for any signal coincident with 28 gravitational wave events detected by the LIGO/Virgo Collaboration between September 2015 and July 2019. For all of these events, NOvA is sensitive to the possible arrival of neutrinos and cosmic rays of GeV and higher energies. For five (seven) events in the NOvA Far (Near) Detector, timely public alerts from the LIGO/Virgo Collaboration allowed the recording of MeV-scale events. No signal candidates were found.